ABSTRACT


A considerable volume of large computational codes has been
developed for NASA over the past twenty-five years. These
codes embody algorithms developed for machines of an earlier
generation. With the emergence of the vector supercomputer
as a viable, commercially available machine, an opportunity
exists to evaluate optimization strategies that improve the
efficiency of existing software. This opportunity arises
primarily from architectural differences between the latest
generation of "large-scale" machines and the earlier, mostly
uniprocessor, machines. This report describes a software
package used by NASA to perform computations on large
matrices and outlines a strategy for converting it to the
Cray X-MP vector supercomputer.
INTRODUCTION


The FORMA (Fortran Matrix Analysis) software package was 
developed by Martin-Marietta approximately twenty years ago. 
This package has been adapted by NASA for use by the Flight 
Dynamics Laboratory at MSFC in solving large structures 
response equations: 


$$ -W M \phi + S \phi = 0 $$

Where $\phi$ = Mode
      M = Mass Matrix
      S = Stiffness Matrix
      W = System Eigenvalues

$$ L(T) = A \frac{d^2 X}{dT^2} + B \frac{dX}{dT} + C X + D F(T) + E $$

Where L = Load Matrix
      A, ..., E = Constant
      T = Time
      X = Position
      F = Forcing Function

and Maximum Dimensions = (12,000 X 12,000)
    Typical Matrix Dimensions = (500 X 500)
    Atypical Matrix Dimensions = (5,000 X 5,000)

Original FORMA codes were adapted for execution on the
MSFC UNIVAC 1108 Multiprocessor. These codes have been
"ported" to a next-generation UNIVAC machine, then the IBM
3084, and now the Cray X-MP. Conversions were accomplished
in a minimum of time, but without attention to optimization
strategies for the host machines. The Cray is particularly
sensitive to vector constructs within programs.
OBJECTIVES


Develop and adapt specialized mathematical/engineering 
techniques or methodologies to the solution of scientific/ 
engineering problems utilizing supercomputer technology. 
Mathematical analyses and modeling of large computerized 
programs will be performed and recommendations for optimizing 
the solutions will be formulated. Oral and written reports
will be presented on research activities and results.
THE COMPUTING ENVIRONMENT


The Engineering Analysis and Data System (EADS) provides 
the Cray user at MSFC with a front-end to the supercomputer 
mainframe. Jobs submitted to the Cray are submitted through 
EADS. Figure 1 shows the system configuration for EADS. 

The portion of EADS which is important to Cray/FORMA 
users is shown in Figure 2. Also included as part of this 
figure are the three general areas of concern in optimization 
studies for codes executing on the Cray.

SYSTEM CONFIGURATION

FIGURE 1. EADS SYSTEM


THE FORMA SYSTEM ENVIRONMENT

FIGURE 2. FORMA ENVIRONMENT


OPTIMIZATION STUDIES


The FORMA (Fortran Matrix Analysis) software package 
consists of the following: 

105 MATRIX ANALYSIS SUBPROGRAMS: 

o 42 Arithmetic Subprograms 

o 45 Matrix Manipulation Subprograms 

o 12 I/O Utility Subprograms 

o 6 System Utility Subprograms 

The FORMA subroutines are characterized by the 
attributes listed here: 

o MODULAR FORTRAN STRUCTURE

The average arithmetic routine is 180 statements
The average matrix manipulation routine is 80 statements
The average I/O utility routine is 30 statements
The average system utility routine is 10 statements


o ARITHMETIC STRUCTURE

Matrices as large as 12,000 X 12,000 are processed
by using submatrices of dimension 60 X 60, plus
residues


o SUBPROGRAM DEPENDENCIES

The average subprogram requires 5 arguments in its call
statement. The average subprogram calls 3 other
subprograms.


o VECTORIZATION

All vectorization is presently the result of
compiler-generated code. The average subprogram
contains approximately 2 vector loops set up in this
fashion.
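
The partition arithmetic implied by the 60 X 60 submatrix
scheme can be sketched as follows. This is a minimal
illustration, not a FORMA routine: the subroutine name PARTN
and its argument names are hypothetical. It returns the
number of partitions along one matrix dimension and the size
of the final (residue) partition.

```fortran
!     Hypothetical helper, not part of FORMA.
!     Splits N rows (or columns) into partitions of at most KP.
!     NPART = number of partitions (assumes N >= 1, KP >= 1)
!     NLAST = size of the last partition (the residue, or KP
!             when N is an exact multiple of KP)
      SUBROUTINE PARTN(N, KP, NPART, NLAST)
      INTEGER N, KP, NPART, NLAST
      NPART = (N + KP - 1) / KP
      NLAST = N - (NPART - 1) * KP
      RETURN
      END
```

With KP = 60, the typical 500 X 500 case gives 9 partitions
per dimension with a residue of 20, and the 12,000 X 12,000
maximum gives 200 full partitions.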

The optimization for vector processing will be very
sensitive to the structure of the existing FORMA subprograms;
however, the Cray X-MP architecture is equally important.
Figure 3 shows the basic register configuration for the Cray
X-MP. The references at the conclusion of this report provide
detailed specifications on the architecture and the COS
operating system.

Of particular importance in the optimization process is
the organization of the eight 64-word vector registers and
their associated vector functional units. The peak computing
speeds achievable by the Cray are principally attributable to
sustained vector computations.

The existing FORMA subprograms should be analyzed for 
the following optimization factors: 

o Subroutine/function calls

o Loop indices and addressing of arrays

o Order dependencies and recursions

o Use of scalars in DO loops

o Decision processes

o Restructuring DO loops

o General rules
Each of the optimization factors is now broken down into
a more detailed list of do's and don'ts relative to 
vectorization: 

CHECK GENERAL RULES 

o Avoid double precision; 

o Use memory interleaving;

o Avoid integer divides; 

o Use parentheses; 

o Avoid mixed mode expressions. 

CHECK SUBROUTINE/FUNCTION CALLS

o Isolate non-vectorizable function CALLS;

o Separate DO loops for non-vector functions;

o Remove (nonrecursive) SUBR CALLS from DO loops; 

o Use statement functions; 

o Convert function CALLS to user vector functions. 

CHECK ORDER DEPENDENCIES AND RECURSIONS:

o Simple subscripts help the compiler to recognize
vectorizable loops;

o Vectorize code on non-recursive loop indices;

o Recognize order dependencies: these are recursions
which can be reordered to remove the dependence
on order;

o Truly recursive operations should be placed in 
separate DO loops; 

o Optimize when vectorization is not possible.
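
The rule about placing truly recursive operations in
separate DO loops can be sketched as follows. This is an
illustration only: the subroutine name SPLIT and its
arguments are hypothetical. The first loop has no dependence
between iterations; the second is a running sum in which each
element needs the one before it, so it is kept apart where it
cannot prevent the first loop from vectorizing.

```fortran
!     Illustrative sketch; the name SPLIT is hypothetical.
!     Assumes N >= 1.
      SUBROUTINE SPLIT(N, A, B, C, S)
      INTEGER N, I
      REAL A(N), B(N), C(N), S(N)
!     No iteration depends on another: vectorizable.
      DO 10 I = 1, N
         C(I) = A(I) * B(I)
   10 CONTINUE
!     S(I) needs S(I-1): a true recursion, kept in its own loop.
      S(1) = C(1)
      DO 20 I = 2, N
         S(I) = S(I-1) + C(I)
   20 CONTINUE
      RETURN
      END
```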
CHECK DECISION PROCESSES:


o Remove loop-independent IF statements from DO loop; 

o Remove IF tests on loop indices and adjust loop bounds 
accordingly; 

o Create separate loops for "low-probability" decision 
statements involving loop indices; 

o Use temporary variable outside DO loop range for
"low-probability" decision statements;

o Avoid the computed GOTO;

o IF-THEN-ELSE is not vectorizable;

o Restructure conditional statements according to
"density of the decision process";

o Perform both halves of condition and then select
proper results (mask undesirable ones).
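
One of the rules above, removing an IF test on the loop
index and adjusting the loop bounds, can be sketched as
follows. The routine names DIFF1 and DIFF2 are hypothetical
and the computation (a first difference) is chosen only for
illustration. Both routines produce the same result; the
second leaves a loop body with no decision in it.

```fortran
!     Illustrative sketch; DIFF1 and DIFF2 are hypothetical names.
!     Both compute Y(1) = X(1), Y(I) = X(I) - X(I-1) for I >= 2.
!     Assumes N >= 1.

!     Before: an IF test on the loop index inside the loop.
      SUBROUTINE DIFF1(N, X, Y)
      INTEGER N, I
      REAL X(N), Y(N)
      DO 10 I = 1, N
         IF (I .EQ. 1) THEN
            Y(I) = X(I)
         ELSE
            Y(I) = X(I) - X(I-1)
         END IF
   10 CONTINUE
      RETURN
      END

!     After: the special case is handled once outside the loop
!     and the lower loop bound is adjusted from 1 to 2.
      SUBROUTINE DIFF2(N, X, Y)
      INTEGER N, I
      REAL X(N), Y(N)
      Y(1) = X(1)
      DO 10 I = 2, N
         Y(I) = X(I) - X(I-1)
   10 CONTINUE
      RETURN
      END
```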


CHECK RESTRUCTURING DO LOOPS: 

o Even if additional calculations are required, remove
scalar statements from DO loops;

o Use vector length of 64 whenever possible; 

o Make longer loops the innermost loops; 

o If possible, convert nested DO loops into a single DO 
loop; 

o Always combine DO loops of equal length;

o "Unroll" small outer loops;

o "Expand" small inner loops.
CHECK THE USE OF SCALARS IN DO LOOPS:

o Check reduction functions, which result in scalars;

o Use MIN, MAX, IMIN, IMAX functions;

o Check dot products, which result in scalars;

o Use the SDOT functions;

o Check matrix multiplication, which results in a
reduction from 2 matrices to a single matrix;

o Use matrix multiplication kernel which allows maximum
vectorization (see example);

o Convert scalar recursions to vector arrays;

o Do not use loop indices in loop calculations.
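
A dot product is the simplest case of a reduction to a
scalar. The sketch below isolates it in one function so that
a single call site could later be pointed at a library
routine such as SDOT. The function DOTSK is a hypothetical
stand-in written for this illustration; it is not the
library routine and handles unit stride only.

```fortran
!     Illustrative stand-in; DOTSK is a hypothetical name and is
!     not the library SDOT. Unit stride only.
      REAL FUNCTION DOTSK(N, X, Y)
      INTEGER N, I
      REAL X(N), Y(N)
!     The accumulator is a scalar carried across iterations,
!     which is why this loop is a reduction.
      DOTSK = 0.0
      DO 10 I = 1, N
         DOTSK = DOTSK + X(I) * Y(I)
   10 CONTINUE
      RETURN
      END
```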

CHECK LOOP INDICES AND ADDRESSING OF ARRAYS:

o Check indirect addressing;

o Avoid use of indirect addressing in generating more
compact codes;

o Use GATHER/SCATTER functions;

o Sparse matrices are an exception;

o Whenever possible, repeated indices should have 
constant "stride"; 

o No complicated expressions for loop indices; 

o Repeated memory references which differ by 8 or 16 
locations can cause memory bank conflicts. 
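
The constant-stride rule can be illustrated with a row-sum
routine. Fortran stores arrays by column, so an inner loop
over the first subscript touches consecutive memory words
(stride 1), while an inner loop over the second subscript
strides by the leading dimension. The routine name ROWSUM and
its arguments are hypothetical. One remedy often suggested
for the bank-conflict case, stated here as an assumption
rather than something this report prescribes, is to declare
the leading dimension LDA so that it is not a multiple of 8
or 16.

```fortran
!     Illustrative sketch; the name ROWSUM is hypothetical.
!     R(I) = sum over J of A(I,J), for I = 1..M.
!     LDA is the declared leading dimension of A (LDA >= M).
      SUBROUTINE ROWSUM(LDA, M, N, A, R)
      INTEGER LDA, M, N, I, J
      REAL A(LDA,N), R(M)
      DO 10 I = 1, M
         R(I) = 0.0
   10 CONTINUE
!     The inner loop runs over I, so A(I,J) is read with
!     stride 1 and R(I) is updated as a vector, not a scalar.
      DO 30 J = 1, N
         DO 20 I = 1, M
            R(I) = R(I) + A(I,J)
   20    CONTINUE
   30 CONTINUE
      RETURN
      END
```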
OPTIMIZATION STRATEGIES


There are several approaches to converting existing,
non-vectorized computer codes into more efficient Cray X-MP
programs. In this section, a short-term strategy is
suggested and an example analysis is discussed. In
addition, a long-term conversion strategy is outlined, along
with a general optimization procedure.

Figure 4 is a flowchart of a short-term optimization
procedure which addresses the conversion of the more critical
subprograms on a priority basis. This flowchart is specific
to the FORMA software package, and when the procedure is
followed for a typical job stream, we obtain the following
results:

1. FORMA routines have been classified one time (this
step is not part of a loop) and documented, noting
several key parameters and briefly describing each
function.

2. A typical job stream was obtained from the System
Response Branch (ED22). This program calculates a
response matrix and requires approximately 25 CPU-SEC
to execute.

3. The flow trace utility provides the following
statistics:


Subprogram    % Run Time    Subprogram Function
----------    ----------    -------------------
RESPONS           2.33      Main Program
NTRANI            7.73      I/O Utility
NTRANR           11.00      I/O Utility
ZRDISK            3.78      I/O Utility
ZWDISK            1.80      I/O Utility
ZMULX1           35.63      [Z] = [A] * [B] + [Z]
ZMULT            28.60      [Z] = [A] * [B]
ZMAXMN            2.52      max [R]
SOLVEQ            1.86      a d2X/dT2 + b dX/dT + cX = 0
OTHER             4.75      36 other subprograms
OPTIMIZATION STRATEGY (SHORT TERM)

1. CLASSIFY & DOCUMENT FORMA ROUTINES
2. OBTAIN TYPICAL JOB STREAM
3. ISOLATE HIGH RUN-TIME PERCENTAGE SUBPROGRAMS
4. APPLY OPTIMIZATION TECHNIQUES TO SELECTED SUBPROGRAM
END

FIGURE 4. SHORT-TERM OPTIMIZATION STRATEGY
4. ZMULX1, ZMULT Optimization:

Since these are similar routines, optimization 
methods will be similar; 

Restructure vector loops: one in each; 

Isolate subroutine calls, especially I/O; 

Use of scalars in DO loops; 

Vectorize decision processes, if appropriate; 

General rules. 

5. We shall treat the discussion of block number 5 in 
the optimization strategy by showing a typical 
analysis process involving matrix multiplication. 
First, consider the "normal" matrix multiplication
program segment:

      DO 10 I = 1, N
      DO 10 J = 1, N
      A(I,J) = 0.0
      DO 10 K = 1, N
   10 A(I,J) = A(I,J) + B(I,K) * C(K,J)

Then consider a "better" multiply kernel which allows the
Cray compiler to set up more efficient vector calculations:

      DO 9 J = 1, N
      DO 9 I = 1, N
      A(I,J) = 0.0
    9 CONTINUE
      DO 10 K = 1, N
      DO 10 J = 1, N
      DO 10 I = 1, N
      A(I,J) = A(I,J) + B(I,K) * C(K,J)
   10 CONTINUE

Notice that the vectorized code is not as compact, but 
it allows the Cray to perform two vector calculations at the 
innermost loop of both nested-DO's. 
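
To check that the reordering changes only the loop order and
not the product, the two kernels above can be wrapped as
subroutines and compared on a small case. This is a sketch
under stated assumptions: the names MMIJK and MMKJI are
hypothetical, single precision REAL is assumed, and each DO
loop is given its own terminal statement. For any fixed
(I,J), both versions add the terms in the same order of K, so
only the memory access pattern of the innermost loop differs.

```fortran
!     Illustrative wrappers; MMIJK and MMKJI are hypothetical names.
!     Both compute A = B * C for N x N matrices.

!     "Normal" order: the innermost loop runs over K, so B(I,K)
!     is read with a stride of N words and A(I,J) is a scalar
!     accumulator.
      SUBROUTINE MMIJK(N, A, B, C)
      INTEGER N, I, J, K
      REAL A(N,N), B(N,N), C(N,N)
      DO 30 I = 1, N
         DO 20 J = 1, N
            A(I,J) = 0.0
            DO 10 K = 1, N
               A(I,J) = A(I,J) + B(I,K) * C(K,J)
   10       CONTINUE
   20    CONTINUE
   30 CONTINUE
      RETURN
      END

!     Reordered kernel: the innermost loop runs over I, so
!     columns of A and B are read with stride 1 and C(K,J) is
!     constant within the loop.
      SUBROUTINE MMKJI(N, A, B, C)
      INTEGER N, I, J, K
      REAL A(N,N), B(N,N), C(N,N)
      DO 20 J = 1, N
         DO 10 I = 1, N
            A(I,J) = 0.0
   10    CONTINUE
   20 CONTINUE
      DO 50 K = 1, N
         DO 40 J = 1, N
            DO 30 I = 1, N
               A(I,J) = A(I,J) + B(I,K) * C(K,J)
   30       CONTINUE
   40    CONTINUE
   50 CONTINUE
      RETURN
      END
```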

Figures 5 and 6 show the ZMULT and ZMULX1 routines which 
were found to be the highest-run-time subprograms in our 
typical run stream. The reader should compare the DO loop 
structure discussed above with these figures. 
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CALL ZBEGIN(NMSA*NRAfNCA? NRP A* NCPA t NR LA » NCLA?INDA ?MHA) 
"CTtX: Z BFGTNtTTMSBTNR B , NC 8 , NR P 8 J KtP B » NR LB f NC LB r I NDB )" 

CALL ZBEGIN(NMSZ*NRZ,NCZ»NRPZfNCPZfNRLZ,NCLZ,INDZ*MHZ> 

^NERRU R-l 

IF ( NC A .LT. NRB) GO TO 999 


IFCNRA .NE. NRZ) GO TO 999 

NER R 0 R = 3 

IF ( NCB .NE. NCZ) GO TO 999 

DO “150 IRP A- IYNRPA' 

NR5 A=KRCPRT 

IFCI R PlTTFljr~NRT-A ) NRS /T=NRlTA 

CALL ZREDI ( INDRP A , 200 f INDA ( IRPA)) 

CA LL ZREDI (TNDRPZT2Gfr7TNDZ ( I R F A ~)~) 

DO 1 AO J C P B = 1 f NCPB 

NCSB = <RCPRT 

IF ( JC PB .F.Q. NCPB) NCSB-NCLB 

X F Ci NDR rz cjc P B) - ; l e . o r G O T O IC O 

CALL ZREDR( SZrKRCPRT^KRCPRT t INDRPZ( JCPB ) ) 

SCTTO-XIO 

103 CALL DZERO(SZ»NRSA» NCSB »KRCPRT) 

~nO"FONTrNUE 

DO 130 IRPB=1,NRPB 


IF ( IRPB -EQ. NRPB) NC$A=NRLB 

IfWDRRACrRPB-)— . L£. 0) SO T 0" 13 0 

CALL ZREDI (INDRPBt 200 t INDB ( IRPB > > 

IFt TNPRPB CJCPB ) i X E » 0 ) G O TO ' 13 0 

CALL ZREDR<SA,KRCPRT*KRCPRT*iNDRPA(IRPB)> 



DO 

1?0 1 = 1* NR 5 A 


*~U~. 

-~Btr 

'lZO-73iT7Nr VS- 


: Rp- 

DO 

120 L= : l » NC S A 



It JJ+5A(l 9 l 

NOVECTOR - REPLACED BY CALL TO "SSDOT* o$cto$oa«.$*o**o$**o**octO**o***[**0<|*o*^p*0005045C 

: : --- T30" CONTINUE — ' 

: : : CALL CHK2:ERCSZ»NRSA»NCSBtIFZER0?KRCPRT) 1 

: : IFCTFZERO .CE. 0 ■7AND^~TRDR"PZT3CPB ) . GT. 0) 

• • «* INDRP Z( JCPB) =-INDRPZ( JCPB) 

i r— itttftfrtj rui • 0) 

J _ <*CALL ZWRrR(SZfKRCPRT*KRCPRTt INDR PZ( JC PB ) ) 

: : : 140 CONTINUE — 

_ 150 CALL ZWRTI(INDRPZ ? 200flNDZ(IRPA)) 

CALL ZCLEAN(NMSZ,INDZ7MHZ1 * 

RETURN 

999-CAXlTT7FT)Wr»“TNaai * SNERROR) “ 

END 

ONT LINE DO LOOP REPLACED AT ScQ. NO. 397 P'= ZDT5 


FIGURE 5. ZMULXl SOURCE CODE 
      SUBROUTINE ZMULT (NMSA,NMSB,NMSZ)
      DOUBLE PRECISION SA,SB,SZ,S,SS
      COMMON /LZ1/ INDA(204),INDRPA(200),MHA(10),SA(60,60)
      COMMON /LZ2/ INDB(204),INDRPB(200),MHB(10),SB(60,60)
      COMMON /LZ3/ INDZ(204),INDRPZ(200),MHZ(10),SZ(60,60)
      DIMENSION X(60,60)
      DATA KRCPRT/60/
C
C     MATRIX MULTIPLICATION FOR PARTITION-LOGIC.  (A) * (B) = (Z).
C     ... ZREDR,ZWRTI,ZWRTR,ZZBOMB.
C     DEVELOPED BY RL WOHLEN.  AUGUST 1977.
C     LAST REVISION BY JOHN ADMIRE.  FEB 1982.
C     IMPLEMENTED ON IBM 3084 BY ...  MARCH 1986.
C
C     SUBROUTINE ARGUMENTS (ALL INPUT)
C     NMSA = PARTITION-LOGIC NAME FOR MATRIX (A).
C     NMSB = PARTITION-LOGIC NAME FOR MATRIX (B).
C     NMSZ = PARTITION-LOGIC NAME FOR MATRIX (Z).
C
C     NERROR EXPLANATION
C     1 = MATRICES (A) AND (B) ARE NOT COMPATIBLE SIZE.
C
C     READ MATRIX (A) HEADER.
      CALL ZBEGIN(NMSA,NRA,NCA,NRPA,NCPA,NRLA,NCLA,INDA,MHA)
C     READ MATRIX (B) HEADER.
      CALL ZBEGIN(NMSB,NRB,NCB,NRPB,NCPB,NRLB,NCLB,INDB,MHB)
C     CHECK (A) AND (B) MATRICES FOR SIZE COMPATIBILITY.
      NERROR=1
      IF (NCA .NE. NRB) GO TO 999
C     INITIALIZE MATRIX (Z) HEADER.
      NRZ=NRA
      NCZ=NCB
      CALL ZOPEN(NMSZ,NRZ,NCZ,NRPZ,NCPZ,NRLZ,NCLZ,INDZ,MHZ)
C     MULTIPLY MATRICES (A) AND (B).
      ...
      NRSA=KRCPRT
      IF (IRPA .EQ. NRPA) NRSA=NRLA
      CALL ZREDI(INDRPA,200,INDA(IRPA))
      CALL ZREDI(INDRPZ,200,INDZ(IRPA))
      ...
      NCSB=KRCPRT
      IF (JCPB .EQ. NCPB) NCSB=NCLB
      CALL ZREDR(SZ,KRCPRT,KRCPRT,INDRPZ(JCPB))
      IFSZ=0
      DO 24 JCPA=1,NCPA
      NCSA=KRCPRT
      IF (JCPA .EQ. NCPA) NCSA=NCLA
      IF (INDRPA(JCPA) .LE. 0) GO TO 24
      CALL ZREDI(INDRPB,200,INDB(JCPA))
      IF (INDRPB(JCPB) .LE. 0) GO TO 24
      IFSZ=1
      CALL ZREDR(SA,KRCPRT,KRCPRT,INDRPA(JCPA))
      ...
      DO 5000 K=1,NCSA
      DO 5000 I=1,NRSA
      SZ(I,J)=SZ(I,J)+SA(I,K)*SB(K,J)
 5000 CONTINUE
      ELSE
      CALL MXM(SA,KRCPRT,SB,KRCPRT,X,KRCPRT)
      DO 5010 J=1,NCSB
      DO 5010 I=1,NRSA
      SZ(I,J)=SZ(I,J)+X(I,J)
 5010 CONTINUE
      END IF
C
      ...
      IF (IFSZ .EQ. 0) GO TO 26
      CALL CHKZER(SZ,NRSA,NCSB,IFZERO,KRCPRT)
      IF (IFZERO .EQ. 0) GO TO 26
      CALL ZWRTR(SZ,KRCPRT,KRCPRT,INDRPZ(JCPB))
      ...
      CALL ZCLEAN(NMSZ,INDZ,MHZ)
      RETURN
      ...


FIGURE 6. ZMULT SOURCE CODE

The reader should also note that the Cray compiler has 
provided printout information showing all program loops, 
which are very important in the vectorization process. The 
compiler also marks each loop to inform the user of the 
vectorization which can be obtained, i.e. , fully vectorized, 
conditionally vectorized, short vector loop, or a vector loop 
replaced by a subroutine call. 

In examining Figures 5 and 6, it should be noted that, 
even for highly modular programs, the application of all 
vectorization rules which have been pointed out is a very 
tedious process. The vectorizing compiler provided by Cray, 
CFT or CFT77, performs well in finding vector constructs; 
however, it cannot perform as well as the vector programmer 
who carefully examines and optimizes codes to fully exploit 
the X-MP architecture. The following estimates conclude this
example by calculating the overall run-time improvement for
RESPONS if the stated levels of improvement are achieved for
the individual subprograms:

Estimate 25% improvement in ZMULX1 

Estimate 25% improvement in ZMULT 

Estimate 15% improvement in the other six 
predominant subroutines 

This yields an estimated overall improvement of
(0.25) (0.64) + (0.15) (0.29) = 0.20,
or 20% improvement in a typical run stream.
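
The arithmetic behind this estimate can be written out from
the flow trace percentages. The grouping is an inference, not
something the report states: the "other six" are taken to be
NTRANI, NTRANR, ZRDISK, ZWDISK, ZMAXMN and SOLVEQ, since
their run-time fractions sum to the 0.29 used above. The
symbols f_mult, f_other and Delta are introduced here only
for the illustration.

```latex
\begin{align*}
f_{\text{mult}}  &= 0.3563 + 0.2860 = 0.6423 \approx 0.64 \\
f_{\text{other}} &= 0.0773 + 0.1100 + 0.0378 + 0.0180
                    + 0.0252 + 0.0186 = 0.2869 \approx 0.29 \\
\Delta &= (0.25)(0.64) + (0.15)(0.29)
        = 0.1600 + 0.0435 = 0.2035 \approx 0.20
\end{align*}
```

For the 25 CPU-SEC job stream described earlier, a 20%
improvement corresponds to a saving of about 5 CPU-SEC per
run.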


Figure 7 shows a long-term strategy which could be
employed if a complete conversion to vectorized code is
justifiable for the FORMA package. This flow chart
represents a procedure which would entail greater expense
and more time, but which would yield a thorough redesign
of the software.

A general optimization strategy is shown by the flow
chart of Figure 8. This procedure is independent of the
specific software package under consideration. Note that the
procedure would require the implementation of general-purpose
test and data generation programs to thoroughly test
vectorization strategies.
5. OPTIMIZE SELECTED SUBPROGRAM UNDER CFT77
6. PUBLISH BASELINE FORMA PACKAGE
END

FIGURE 7. LONG-TERM STRATEGY
ALL SUBPROGRAMS OPTIMIZED
DOCUMENT

FIGURE 8. GENERAL STRATEGY
CONCLUSIONS AND FUTURE DIRECTION


Optimization of computer programs to achieve highly
vectorized codes is a very exacting and time-consuming
process. It is labor-intensive and requires highly skilled
personnel. These costs must be balanced against the fact
that software such as the FORMA routines is a long-term
investment. There are high initial costs associated with
the optimization process, but there are long-term advantages
to reducing CPU-minutes for frequently used programs.

The FORMA software package would be an excellent
candidate for long-term optimization procedures. If this is
done, several key areas would need to be addressed. These
are:

o The CFT77 compiler should be used in generating
object code. In doing this, compiled codes should
be compared with previous compilations to ensure the
integrity of the compile process.

o I/O utility routines are not particularly good
candidates for optimization. However, these are
frequently used routines and unique I/O speed-up
features on the Cray should be investigated. These
would include BUFFER IN/BUFFER OUT and unformatted
I/O.

o Custom performance monitoring routines should be
implemented. These could provide users with a means
to easily monitor performance enhancements and to
monitor any difference in results obtained.

o The optimization techniques which are effective tend 
to be reusable; that is, once learned or recognized, 
the same techniques can generally be applied a 
number of times in a given software package. 
Therefore, the more effective vectorization 
techniques should be well documented, including 
applicable performance statistics. 
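
As one illustration of monitoring "any difference in results
obtained", the sketch below compares the output of an
optimized routine against a baseline result. The function
name MAXDIF is hypothetical and is not part of FORMA; single
precision REAL data of length N is assumed.

```fortran
!     Hypothetical monitoring helper, not part of FORMA.
!     Returns the largest absolute difference between a baseline
!     result X and the result Y of an optimized routine.
      REAL FUNCTION MAXDIF(N, X, Y)
      INTEGER N, I
      REAL X(N), Y(N)
      MAXDIF = 0.0
      DO 10 I = 1, N
         MAXDIF = MAX(MAXDIF, ABS(X(I) - Y(I)))
   10 CONTINUE
      RETURN
      END
```

A value of zero indicates identical results; a small nonzero
value would be expected whenever an optimization changes the
order of floating-point operations.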
The Cray X-MP at NASA/MSFC represents a significant
investment in high-performance computing technology. As 
such, resources to support this machine are critical. Those 
personnel writing new programs for the Cray X-MP should be 
well-versed in good vectorization techniques. In addition, 
permanent staff with in-depth knowledge of vectorization 
tools and techniques is important to the effective use of the 
present machine, as well as future upgrades and 
next-generation machines. 